The dehydration-responsive element-binding (Dreb) genes, which belong to the AP2/ERF
transcription factor family and play a key role in the drought-induced transcriptome of wheat,
were studied using in silico methods. For this purpose, information on the relevant genes
(Accession Nr. AF303376.1, AB193608.1, KM520370.1, DQ195068.1) was obtained from the NCBI.
The FASTA sequences of the proteins corresponding to each gene were compared using the MAFFT
alignment software (CLUSTAL output format), and the main conserved regions were identified. Two
conserved functional amino acids specific to the AP2 domain, valine and glutamic acid, were
identified at positions 14 and 19 in all studied genes. Specific amino acid substitutions were
identified in the D genome-specific dehydration-responsive element-binding protein (DQ195068.1),
in the regions involved in the formation of the nuclear localization signal (NLS) and the
α-helix structure. The results obtained could serve as a scientific basis for future laboratory
studies of Dreb genes in wheat.


Keywords: Dreb, AP2 domain, nuclear localization signal (NLS), α-helix, in silico analysis


INTRODUCTION 


Although advances in genomics have contributed
to the improvement of some agriculturally
important crops, similar efforts in wheat (Triticum
spp.) have been more challenging. This is attributed to
the size and complexity of the wheat genome and
the lack of genome-assembly data for multiple
wheat lines (Walkowiak et al., 2020; Alotaibi et
al., 2021). The current knowledge of wheat biology
and of the molecular basis of central agronomic
traits is not sufficient for wheat breeding. There
is an urgent need for wheat research and breeding
to accelerate genetic gain and to increase
and protect wheat yield and quality traits in order
to meet the demands of human population growth
(Zhu et al., 2021). Clarification of gene functions,
the availability of wheat genome sequence information,
and genome editing methods will
open up new opportunities for improving crops
under stress conditions (Rathan, 2021).

Some candidate genes involved in adaptive
responses to abiotic stress have been
identified in cereals. Transcription regulators
have been found to play an important role in the
adaptation of plants to changing environmental
conditions. According to large-scale
transcriptome analyses, protective proteins and
regulatory proteins are the main types of molecular
stress responses (Shahzad et al., 2021).
Proteins such as chaperones, osmotin, antifreeze
proteins, and mRNA-binding proteins protect cells
against stress conditions. Regulatory proteins such
as transcription factors, including myeloblastosis
oncogene (MYB), basic leucine zipper (bZIP),
NAM, ATAF, CUC (NAC), and dehydration-responsive
element-binding (DREB) proteins,
reprogram gene expression to protect the plant
from stress. The interaction between transcription
factors and the cis-elements of target gene
promoters is very important for the regulation of
stress-related gene expression.

Dehydration-responsive element-binding
proteins are essential transcription factors that
activate stress-related genes (Niu et al., 2020).
Two types of DREB transcription factors have been
described: DREB1 and DREB2. They are involved
in different signal transduction pathways under
low temperature and dehydration, respectively.
The C-repeat (CRT) and the low-temperature-responsive
element (LTRE), both known as cis-acting
elements, contain an A/GCCGAC motif,
similar to the core of the DRE sequence, and regulate
cold-inducible gene expression. DREB1/DREB2
homologous genes have been identified in several
grasses, including wheat, rice, barley, sorghum,
maize, oat, rye, and perennial ryegrass. Dreb1/CBF
genes are induced by cold, whereas the Dreb2
genes are generally induced by dehydration, high
salinity, and heat (Hassan et al., 2021). DREB
family proteins, which belong to the AP2/ERF
transcription factor family, contain one AP2/ERF
DNA-binding domain.
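
The A/GCCGAC core motif mentioned above can be searched for programmatically. The following is a minimal Python sketch and not the procedure used in this study: the function names and the example sequence are hypothetical, and the scan simply reports every [AG]CCGAC match on both strands of a DNA string.

```python
import re

# Core of the DRE/CRT element: A or G followed by CCGAC.
# A lookahead is used so that overlapping matches are not skipped.
DRE_CORE = re.compile(r"(?=([AG]CCGAC))")
MOTIF_LEN = 6


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_dre_core(dna):
    """List (start, strand, motif) for every core motif on both strands.

    start is the 0-based forward-strand index of the leftmost base
    covered by the match.
    """
    dna = dna.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DRE_CORE.finditer(dna)]
    rc = reverse_complement(dna)
    for m in DRE_CORE.finditer(rc):
        # Convert the reverse-strand offset to a forward-strand coordinate.
        hits.append((len(dna) - m.start() - MOTIF_LEN, "-", m.group(1)))
    return sorted(hits)


# Hypothetical toy sequence, not taken from any of the accessions studied.
toy_promoter = "TTACCGACATGGTCGGCTT"
print(find_dre_core(toy_promoter))
# [(2, '+', 'ACCGAC'), (11, '-', 'GCCGAC')]
```

Scanning both strands is a design choice: a cis-element may lie in either orientation relative to the gene, so a forward-only search could miss sites.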

We aimed at a detailed analysis of Dreb
genes in the bread wheat genome. For this
purpose, we performed a comprehensive in silico
analysis of four Dreb genes using the
nucleotide and protein sequences available in current
databases.


MATERIALS AND METHODS 


Sequence Sources 

The complete cDNA and corresponding protein
sequences of DREB in wheat (Triticum aestivum
L.) were retrieved from the GenBank database
(http://www.ncbi.nlm.nih.gov/genbank/).

Multiple Sequence Alignment

Multiple sequence alignment is the
alignment of three or more biological sequences
(protein or nucleic acid) of similar length. Based
on the results, homology can be assessed and the
evolutionary relationships between the sequences
can be studied. The MAFFT v7.427 program
(https://mafft.cbrc.jp/alignment/server/)
with CLUSTAL format output was used
for the alignment. MAFFT (Multiple Alignment
using Fast Fourier Transform) is a high-speed
multiple sequence alignment program.
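
As an illustration of how conserved positions can be read from an alignment, the sketch below marks the columns in which every row carries the same residue, in the spirit of the "*" line of CLUSTAL output. It is a toy example under stated assumptions: the three aligned rows are hypothetical and are not taken from the sequences studied here, the function names are illustrative, and MAFFT itself is not invoked.

```python
def conserved_columns(aligned, gap="-"):
    """Return 0-based indices of gap-free columns with a single residue."""
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*aligned))
        if gap not in column and len(set(column)) == 1
    ]


def conservation_line(aligned, gap="-"):
    """Build a CLUSTAL-like marker line: '*' for fully conserved columns."""
    conserved = set(conserved_columns(aligned, gap))
    return "".join("*" if i in conserved else " " for i in range(len(aligned[0])))


# Hypothetical aligned rows (already padded with gaps to equal length).
toy_alignment = [
    "MET-GSK",
    "METAGSK",
    "MTV-GSR",
]
print(conserved_columns(toy_alignment))  # [0, 4, 5]
print(repr(conservation_line(toy_alignment)))  # '*   ** '
```

Only fully identical columns are marked here; the ":" and "." symbols that CLUSTAL uses for conservative substitutions would require a residue-similarity table and are left out of this sketch.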
Annotation of Functional Motifs

To search for functional motifs in the selected
Dreb genes, the relevant tools of the Softberry,
Inc. software (http://www.softberry.com/)
intended for annotation of plant genomes were
used.

Softberry, Inc. is known as a leading
developer of software tools for genomic research
focused on computational methods of high-throughput
biomedical data analysis, including
software to support next-generation sequencing
technologies, transcriptome analysis (with
RNA-Seq data), SNP detection, and selection of
disease-specific SNP subsets.

The NSITE-PL service was used to search for
promoter/functional motifs
(http://www.softberry.com/berry.phtml?topic=nsitep&group=programs&subgroup=promoter)
(Shahmuradov and Solovyev, 2015; Solovyev et
al., 2010).

Annotation of Sub-Cellular Localization

The ProtComp v. 9.0 service was used to predict the
localization of the studied gene products
among cell compartments.


RESULTS AND DISCUSSION 


Despite the fundamental knowledge
gained from studies of the wheat
genome and the importance of the crop, a
comprehensive genome-wide analysis of gene
content was not conducted until recently (Dabab
Nahas et al., 2019). This was because of the large
size, repeat content, and polyploid complexity of
the genome: assembly of the 17-Gb
allohexaploid genome of Triticum aestivum faced
major difficulties because it is composed of three
large, repetitive, and closely related genomes. In
the current study, the Dreb genes of bread wheat
were analyzed in silico. For this purpose, four
Dreb genes were selected from
the GenBank database, and information about them
was obtained. The first entry we analyzed
was Accession Nr. JQ004969.1, the beta isoform of
the DREB AP2 binding factor, labeled “Triticum
aestivum DREB AP2 binding factor beta isoform
mRNA, complete cds”. The length of this mRNA is
1286 bp. The protein-coding region is much
shorter and lies between nucleotides 64 and 264.
The id of the protein
corresponding to this gene is "AEX59145.1". This
gene is translated into the following amino acid
sequence:
"MTVDRKHAEAAAAAPFEIPALQPGRTCGA
EESTRSHVLVKPIGKSDLGDHVMGLIQSLKR
SGDGKK".

The second entry is an
AP2-containing protein-encoding gene labeled
“Triticum aestivum AP2-containing protein
(Dreb1) mRNA, complete cds” with Accession Nr.
AF303376.1. The length of this locus is 1292 bp.
The protein-coding region lies
between nucleotides 252 and 1088. The id of the
protein corresponding to this gene in GenBank is
"AAL01124.1". The gene is translated into
the following amino acid sequence:
"METGGSKREGDCPGQERKKKVRRRSTGPD
SVAETIKKWKEENQKLQQENGSRKAPAKGS
KKGCMAGKGGPENSNCAYRGVRQRTVGK
WVAEIREPNRGNRLWLGSFPTAVEAARAYD
DAARAMYGAKARVNFSEQSPDANSGCTLA
PPLPMSNGATAASHPSDGKDESESPPSLISNA
PTAALHRSDAKDESESAGTVARKVKKEVSN
DLRSTHEEHKTLEVSQPKGKALHKAANVSY
DYFNVEEVLDMITVELSADVKMEAHEEYQD
GDDGFSLFSY".

The Dreb gene listed in NCBI under the
name “Triticum aestivum Wdreb2 mRNA for
EREBP/AP2 type transcription factor, complete
cds” is available under Accession Nr.
AB193608.1. The length of this locus is
1456 bp. The protein-coding region lies
between nucleotides 123 and 1157. The
amino acid sequence of the EREBP/AP2 type
transcription factor protein corresponding to this
gene is as follows:
"MTVDRKHAEAAAAAPFEIPALQPGRKKRP
RRSRDGPNSVSETIRRVVKEVNQQLEHDPQG
AKRARKPPAKGSKKGCMLGKGGPENTQCG
FRGVRQRTWGKWVAIREPNRVSRLWLGTFP
TAEDAARAYDEAARAMYGALARTNFPVHP
AQAPAVAVPAAIEGVVRGASASCESTTTSTN
HSDVASSLPRQAQAPEIYSQPDALESTESVV
LESVEHYSHQDTVPDAGSSISRSTSEEDVFEP
LEPISSLPDGEADGFDIEELLRLMEADPIEVE
LVTGGSWNGGANTGVEMGQQEPLYLDGLD
QGMLEGMLQSDYPYPMWISEDRAMHNSAF
HDAEMSEFFEGL". The protein corresponding
to this gene is deposited in GenBank with the id
"BAD97369.1”.

[Alignment of AAL01124.1, ABA08424.1, and BAD97369.1; CLUSTAL format alignment by MAFFT (v7.481)]

Fig. 1. CLUSTAL format alignment by MAFFT. The specific nuclear localization signal (NLS) region of the Dreb
genes is highlighted in green; the amino acid substitutions in this region are highlighted in red. Valine 14 and
glutamic acid 19 of the AP2 domain are highlighted in yellow. The α-helix-specific sequence is highlighted in pink, and the
amino acid changes observed in this region are highlighted in blue.

Finally, the last gene we studied is
listed under the name “Triticum
aestivum genome D dehydration-responsive
element-binding protein (Dreb1) gene, complete
cds” with Accession Nr. DQ195068.1. This locus
is longer, at 1748 bp. The protein-coding region
consists of two exons, spanning
nucleotides 20-69 and 771-1557.
The id of this protein, which binds the
dehydration-responsive element, is
"ABA08424.1" in GenBank. The nucleotide
sequence is translated into the following amino
acid sequence:
“METGGSKREGDCPGQERKKKVRRRSTGPD
SVAETIKKWKEENQKLQQENGSRKAPAKGS
KKGCMAGKGGPENSNCAYRGVRQRTWGK
WVAEIREPNRGNRLWLGSFPTAVEAARAYD
DAARAMYGAKARVNFSEQSPDANSGCTLA
PPLPMSNGATAASHPSDGKDESESPPSLISNA
PTAALHRSDAKDESESAGTVARKVKKEVSN
DLRSTHEEHKTLEVSQPKGKALHKAANVSY
DYFNVEEVLDMITVELSADVKMEAHEEYQD
GDDGFSLFSY”
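
The coding-region coordinates quoted for the four entries can be cross-checked with simple arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that each coding region ends with a stop codon (the usual convention for a GenBank "complete cds"); the function and variable names are illustrative.

```python
def cds_protein_length(segments):
    """Protein length implied by CDS segments given as (start, end) pairs.

    Coordinates are assumed to be 1-based and inclusive, and the last
    codon is assumed to be a stop codon that is not translated.
    """
    nucleotides = sum(end - start + 1 for start, end in segments)
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return nucleotides // 3 - 1


# Coordinates as quoted in the text for each accession.
CDS_SEGMENTS = {
    "JQ004969.1": [(64, 264)],
    "AF303376.1": [(252, 1088)],
    "AB193608.1": [(123, 1157)],
    "DQ195068.1": [(20, 69), (771, 1557)],  # two exons
}

expected = {acc: cds_protein_length(seg) for acc, seg in CDS_SEGMENTS.items()}
print(expected)
# {'JQ004969.1': 66, 'AF303376.1': 278, 'AB193608.1': 344, 'DQ195068.1': 278}

# Sequence quoted in the text for AEX59145.1 (product of JQ004969.1).
BETA_ISOFORM = (
    "MTVDRKHAEAAAAAPFEIPALQPGRTCGA"
    "EESTRSHVLVKPIGKSDLGDHVMGLIQSLKR"
    "SGDGKK"
)
print(len(BETA_ISOFORM) == expected["JQ004969.1"])  # True
```

Under these assumptions the two-exon genomic entry and the Dreb1 mRNA entry imply the same protein length, 278 residues.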























Fig. 2. Annotation of functional motifs for the Triticum aestivum DREB AP2 binding factor gene
(Accession Nr. JQ004969.1) using NSITE-PL (http://www.softberry.com).

ProtComp Version 9.0
Seq name: test sequence, Length=278
Identifying sub-cellular location (Plant)
Significant similarity in Location DB - Nuclear
Database sequence: Location: Nuclear; DE: Dehydration-responsive element-; Score=98, Sequence length=275, Alignment length=270
Predicted by Neural Nets - Extracellular (Secreted) with score 1.0
Integral Prediction of protein location: Nuclear with score 8.9

Fig. 3. Annotation of sub-cellular localization for the Triticum aestivum genome D dehydration-responsive
element-binding protein (protein id “BAD97369.1”) using ProtComp 9.0
(http://www.softberry.com).

CLUSTAL format alignment by MAFFT
(v7.481) revealed highly conserved regions
in the amino acid sequences encoded by these genes. The
beta isoform (protein id: "AEX59145.1") of the
DREB AP2 binding factor, with Accession Nr.
JQ004969.1 in GenBank, was excluded because of
inconsistencies in this alignment. Homology is
more pronounced between the Dreb1 gene with
Accession Nr. AF303376.1 (protein id
"AAL01124.1") and the Wdreb2 gene with
Accession Nr. AB193608.1 in GenBank. In the
amino acid sequences of both proteins, a
nuclear localization signal (NLS) specific to the Dreb
genes is observed in the peptide sequence
(RKKKVR) (highlighted in green in Figure 1). In
the amino acid sequence encoded by the
dehydration-responsive element-binding protein
(Dreb1) gene (Accession Nr. DQ195068.1) of the
Triticum aestivum D genome, the signal
sequence reads RKKRPR: arginine replaces
lysine at the 4th position and
proline replaces valine at the 5th position
(substituted amino acids are highlighted in red
in Figure 1).

In addition, the MAFFT alignment of the amino acid sequences
corresponding to the Dreb genes
revealed the AP2 domain with two conserved
functional amino acids, valine (V) and glutamic
acid (E), at the 14th and 19th residues, which play
crucial roles in recognition of the DNA-binding
sequence (Fig. 1). However, according to
Sakuma et al. (2002), E19 might not be as
necessary as V14 in this case.
Nevertheless, both of these amino acids are
found in the three proteins we studied
(highlighted in yellow in the figure).
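
A check of the residues at positions 14 and 19 can be sketched as follows. This is an illustration only and rests on a labeled assumption: the AP2 domain is taken to start two residues before the conserved RGVRQR stretch, which is the numbering under which valine and glutamic acid fall at positions 14 and 19 in the fragment below. The function name and the anchor parameter are hypothetical, and the fragment is copied from the ABA08424.1 sequence quoted in the text.

```python
def ap2_residues(protein, anchor="RGVRQR", offset=2, positions=(14, 19)):
    """Return {position: residue} using 1-based AP2-domain numbering.

    The domain start is assumed to lie `offset` residues before the
    first occurrence of `anchor`; both values are assumptions of this
    sketch, not a published domain definition.
    """
    i = protein.find(anchor)
    if i < offset:
        raise ValueError("anchor motif not found or too close to the start")
    start = i - offset
    return {p: protein[start + p - 1] for p in positions}


# Fragment of the ABA08424.1 sequence as quoted in the text.
fragment = "NSNCAYRGVRQRTWGKWVAEIREPNRGNRLWLGSFP"
print(ap2_residues(fragment))  # {14: 'V', 19: 'E'}
```

Anchoring on a conserved motif rather than on absolute coordinates lets the same check be applied to proteins whose N-terminal regions differ in length.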

The characteristic sequence of the α-helix
(VEAARAYDDAARAMYG) was also identified
in our in silico analysis. Interestingly, the BAD97369.1
protein also contains amino acid substitutions in
this region. Valine, the first amino acid involved
in the formation of the α-helix, is replaced by
glutamic acid in this protein, and the glutamic acid
in the second position is replaced by aspartate. In the
ninth position of this region, conversely,
aspartate is replaced by glutamic acid. It should be
noted that the gene of this protein, which differs
both in the NLS sequence and in the α-helix region,
is a Dreb gene specific to the D genome.
Pandey et al. (2014) were the first to build the tertiary
structure of the DREB2 protein from wheat by
homology modeling based on the crystal structure
of the GCC-box binding domain of Arabidopsis
thaliana, which contributed to understanding the
structure-function relationships. Protein docking
with DNA containing the GCC-box revealed further
similarities between the AP2/EREBP proteins of A.
thaliana and T. aestivum. The protein was found to
interact through its β-sheet with the major groove
of DNA via hydrogen bonds and hydrophobic interactions, which
provide structural stability to the molecule. The
model comprises a three-stranded antiparallel
β-sheet followed by an α-helix and an
unstructured C-terminal region.
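
The substitutions described for the NLS and the α-helix region can be listed mechanically by comparing equal-length peptides position by position. The sketch below is illustrative: the function name is hypothetical, the NLS peptides and the reference helix sequence are those quoted in the text, and the variant helix peptide is read from the BAD97369.1 sequence quoted earlier.

```python
def substitutions(reference, variant):
    """Return (position, reference_residue, variant_residue) for mismatches.

    Positions are 1-based. The peptides must already be aligned without
    gaps, so they must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Nuclear localization signal peptides quoted in the text.
print(substitutions("RKKKVR", "RKKRPR"))
# [(4, 'K', 'R'), (5, 'V', 'P')]

# Alpha-helix region: reference from the text, variant from BAD97369.1.
print(substitutions("VEAARAYDDAARAMYG", "EDAARAYDEAARAMYG"))
# [(1, 'V', 'E'), (2, 'E', 'D'), (9, 'D', 'E')]
```

In one-letter code, K is lysine, R arginine, V valine, P proline, E glutamic acid, and D aspartate.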

To search for functional motifs in the
selected DREB genes, the NSITE-PL service
of the Softberry, Inc. software
(http://www.softberry.com/) for annotation of plant genomes was used
(Shahmuradov and Solovyev, 2015). Ten motifs
for seven different transcription factor-binding
sites (TFBS) were found in the Triticum aestivum
DREB AP2 binding factor beta isoform with
Accession Nr. JQ004969.1 (Figure 2). Seventeen
motifs for fifteen different TFBS were found in the
Triticum aestivum AP2-containing protein
(Dreb1) with Accession Nr. AF303376.1. Nine
motifs for eight different TFBS were found in the Triticum
aestivum Wdreb2 mRNA for EREBP/AP2 type
transcription factor (Accession Nr. AB193608.1),
and seventeen motifs for fifteen different TFBS in the
Triticum aestivum genome D dehydration-responsive
element-binding protein (Dreb1) gene
(Accession Nr. DQ195068.1).

Annotation of sub-cellular localization was
performed with the ProtComp v. 9.0 service of
Softberry for the products of the studied genes. Nuclear
localization was predicted for three of the
transcription factors studied (Figure 3), and
extracellular localization was predicted only for the
DREB AP2 binding factor beta isoform. This
result requires more in-depth research.

In silico identification and characterization of
genes in various organisms under different
conditions has gained importance owing to the growing
volume of data in databases (Dabab Nahas et al., 2019). Our
analyses could provide a scientific basis for understanding
Dreb genes and proteins and for further wet-lab studies
in wheat.


In silico analysis of Dreb transcription factor genes in bread wheat
S. M. Rüstəmova
Institute of Molecular Biology and Biotechnologies, Azerbaijan National Academy of Sciences (AMEA), Baku, Azerbaijan


Dehydration-responsive element-binding (Dreb) genes, members of the AP2/ERF transcription factor
family that occupy a central place in the drought-induced transcriptome of wheat, were analyzed in silico.
For this purpose, information on the genes (accession numbers: AF303376.1, AB193608.1,
KM520370.1, DQ195068.1) was obtained from the NCBI database. The FASTA data of the proteins
corresponding to each gene were compared using the MAFFT CLUSTAL format alignment program, and the
main conserved regions were identified. In all the genes studied, two conserved functional amino acids
specific to the AP2 domain, valine and glutamic acid, were identified at positions 14 and 19.
Specific amino acid substitutions were identified in the D genome-specific dehydration-responsive
element-binding protein (DQ195068.1), in the regions involved in the formation of the nuclear localization
signal (NLS) and of the α-helix in the spatial structure. The results obtained may serve as a scientific
basis for future laboratory studies of Dreb genes in wheat.


Keywords: Dreb, AP2 domain, nuclear localization signal (NLS), α-helix structure, in silico analysis

